euMMD: efficiently computing the MMD two-sample test statistic for univariate data
Abstract
The maximum mean discrepancy (MMD) test is a nonparametric kernelised two-sample test that, when using a characteristic kernel, can detect any distributional change between two samples. However, when the total number of $d$-dimensional observations is $n$, direct computation of the test statistic is $\mathcal{O}(dn^2)$. While approximations with lower computational complexity are known, more efficient methods for computing the exact test statistic are unknown. This paper provides an exact method for computing the MMD test statistic for the univariate case in $\mathcal{O}(n \log n)$ using the Laplacian kernel. Furthermore, this exact method is extended to an approximate method for $d$-dimensional real-valued data, also with complexity log-linear in the number of observations. Experiments show that this approximate method can have good statistical performance when compared to the exact test, particularly in cases where $d > n$.
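To make the complexity claim concrete, here is a minimal sketch of one way sorting can bring the univariate Laplacian-kernel computation down to $\mathcal{O}(n \log n)$. It is an illustration under stated assumptions, not necessarily the paper's algorithm: the kernel is taken to be $k(a, b) = \exp(-\beta |a - b|)$, the statistic is the biased (V-statistic) estimate of the squared MMD, and the names `laplacian_pair_sum`, `mmd2_fast`, `mmd2_naive`, and `beta` are hypothetical. The idea is that, for sorted values, the sum of kernel values between a point and all points to its left satisfies a one-step recursion, so all pairwise sums follow from a single pass after sorting.

```python
import numpy as np


def laplacian_pair_sum(v, beta):
    """Sum of exp(-beta * |v_i - v_j|) over all pairs i < j.

    Sorting costs O(n log n); the pass over the sorted values is O(n).
    """
    s = np.sort(np.asarray(v, dtype=float))
    # decay[j-1] = exp(-beta * (s[j] - s[j-1])), always in (0, 1]
    decay = np.exp(-beta * np.diff(s))
    running = 0.0  # R_j = sum_{i<j} exp(-beta * (s[j] - s[i]))
    total = 0.0
    for d in decay:
        # R_j = decay * (R_{j-1} + 1): rescale the old terms, add the new neighbour
        running = d * (running + 1.0)
        total += running
    return total


def mmd2_fast(x, y, beta=1.0):
    """Biased MMD^2 estimate (assumed form) using only sorted pair sums."""
    m, n = len(x), len(y)
    t_xx = laplacian_pair_sum(x, beta)
    t_yy = laplacian_pair_sum(y, beta)
    t_zz = laplacian_pair_sum(np.concatenate([x, y]), beta)
    # Pairs in the pooled sample = pairs within x + pairs within y + cross pairs
    cross = t_zz - t_xx - t_yy
    # Full double sums include the diagonal (k(a, a) = 1) and both orderings
    return (m + 2.0 * t_xx) / m**2 + (n + 2.0 * t_yy) / n**2 - 2.0 * cross / (m * n)


def mmd2_naive(x, y, beta=1.0):
    """Reference O(n^2) computation of the same quantity."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def gram(a, b):
        return np.exp(-beta * np.abs(a[:, None] - b[None, :]))

    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = rng.normal(loc=0.5, size=300)
    print(mmd2_fast(x, y), mmd2_naive(x, y))  # the two values should agree closely
```

The cross term is obtained by subtraction from the pooled pair sum, which keeps the sketch short but may lose some precision when the two samples are nearly identical; a production implementation would likely compute it directly.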
Related resources
Asymptotic algorithm for computing the sample variance of interval data
The problem of the sample variance computation for epistemic interval-valued data is, in general, NP-hard. Therefore, known efficient algorithms for computing variance require strong restrictions on admissible intervals like the no-subset property or heavy limitations on the number of possible intersections between intervals. A new asymptotic algorithm for computing the upper bound of the samp...
Two-Sample Median Test for Vague Data
Classical statistical tests may be sensitive to violations of the fundamental model assumptions inherent in the derivation and construction of these tests. It is obvious that such violations are much more probable in the presence of vague data. Thus nonparametric tests seem to be promising statistical tools. A generalization of the median test for the two-sample problem with vague data is sugge...
Test for Exponentiality Based on the Sample Covariance
This paper proposes a simple goodness-of-fit test based on the sample covariance. It is shown that this test is preferable for alternatives of increasing and unimodal failure rate. Critical values for various sample sizes are determined by means of Monte Carlo simulations. We compare the test based on the sample covariance with tests based on Hoeffding's maximum correlation. The usefulness o...
A multivariate two-sample mean test for small sample size and missing data.
We develop a new statistic for testing the equality of two multivariate mean vectors. A scaled chi-squared distribution is proposed as an approximating null distribution. Because the test statistic is based on componentwise statistics, it has the advantage over Hotelling's $T^2$ test of being applicable to the case where the dimension of an observation exceeds the number of observations. An appeal...
Efficiently Computing Data-Independent Memory-Hard Functions
A memory-hard function (MHF) f is equipped with a space cost σ and time cost τ parameter such that repeatedly computing fσ,τ on an application specific integrated circuit (ASIC) is not economically advantageous relative to a general purpose computer. Technically we would like that any (generalized) circuit for evaluating an iMHF fσ,τ has area × time (AT) complexity at Θ(σ ∗ τ). A data-independe...
Journal
Journal title: Statistics and Computing
Year: 2023
ISSN: ['0960-3174', '1573-1375']
DOI: https://doi.org/10.1007/s11222-023-10271-x